MOVING PICTURE ENCODING SYSTEM 



BACKGROUND OF THE INVENTION 
The present invention relates to a moving picture encoding
system in which a moving picture is encoded with respect to each object,
or in which a moving picture is divided into plural sections and bit
allocation for encoding is decided with respect to each section, and in
particular relates to an improvement in bit rate control.
Description of the Related Art 

In encoding moving pictures, Test Model 5 (hereinafter referred to
as TM-5) of MPEG-2 (Moving Picture Experts Group, Phase 2) employs a
coding control system in which the target bit number of each frame is
determined based on the amount of available bits, and the number of bits
generated for the frame is controlled so as to approximate the target
number (manuscript: "March, 1993, ISO/IEC JTC 1/SC 29/WG 11/N0400").

Fig. 1 is a block diagram showing a conventional bit rate
control scheme. The rate control scheme comprises a coding means
1001, a target bit rate calculating means 1002 and a bit number model
parameter calculating means 1006. The coding means 1001 is supplied
with image data and the output of the target bit rate calculating means
1002, and outputs bit streams as its first output. Its second output
(generated bit number) is sent to the target bit rate calculating means
1002 and the bit number model parameter calculating means 1006, and
its third output (coding information) is sent to the bit number model
parameter calculating means 1006. The bit number model parameter
calculating means 1006, to which the second and third outputs of the
coding means 1001 are inputted, sends its output (bit number model
parameter) to the target bit rate calculating means 1002. The target bit
rate calculating means 1002 is supplied with the second output of the
coding means 1001, the output of the bit number model parameter
calculating means 1006 and available bits information, and sends its
output (target bit number) to the coding means 1001.

In the coding means 1001, the inputted image data are
encoded under the coding control so that the number of bits used for each
frame meets the target supplied from the target bit rate calculating
means 1002. Subsequently, the number of generated bits and the coding
information of each frame are outputted along with the encoding results,
which are outputted as bit streams. Here, the coding information is the
quantization parameters used in encoding.

The bit number model parameter calculating means 1006
calculates parameters for modeling the bit number. In the system of
TM-5, this parameter is a complexity index defined as the product of the
generated bit number and the quantization parameter. Namely,
assuming that X_i, X_p, X_b; S_i, S_p, S_b; and Q_i, Q_p, Q_b denote the
complexity index, the number of generated bits, and the average value of
the quantization parameter for each of I, P and B pictures, respectively,
the following expressions (1), (2) and (3) can be formed.

\[ S_i = \frac{X_i}{Q_i} \tag{1} \]

\[ S_p = \frac{X_p}{Q_p} \tag{2} \]

\[ S_b = \frac{X_b}{Q_b} \tag{3} \]

The obtained complexity indexes X_i, X_p and X_b are outputted to the
target bit rate calculating means 1002 as model parameters.

The target bit rate calculating means 1002 estimates the
target bit number for each picture according to the allocatable bit number
information, the model parameters and the number of generated bits, and
outputs the result to the coding means 1001. In MPEG-2, bit allocation
is performed with respect to each GOP (Group Of Pictures). First, the
number of available bits for a GOP is found from the allocatable bit number
information. Subsequently, the number of bits which have been used
for coding the previous pictures in the GOP is deducted from the
number of available bits to estimate the number of bits that are available
for the remainder of the GOP (hereinafter referred to as R). After that,
using predetermined constants K_p and K_b, which indicate the coarseness
ratio in quantization of P and B pictures, the target bit numbers for the
respective I, P and B pictures (hereinafter referred to as T_i, T_p and T_b,
respectively) are calculated by the following expressions (4), (5) and (6).

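\[ T_i = \frac{R}{1 + \dfrac{N_p X_p}{X_i K_p} + \dfrac{N_b X_b}{X_i K_b}} \tag{4} \]

\[ T_p = \frac{R}{N_p + \dfrac{N_b K_p X_b}{K_b X_p}} \tag{5} \]

\[ T_b = \frac{R}{N_b + \dfrac{N_p K_b X_p}{K_p X_b}} \tag{6} \]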

In (4), (5) and (6), N_p and N_b denote the numbers of P and B pictures that
remain to be coded in the GOP. The target bit numbers are outputted
to the coding means 1001, and used for rate control.
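
By way of illustration only, the TM-5 style allocation described above can
be sketched as follows. The function and argument names are hypothetical,
and the default values K_p = 1.0 and K_b = 1.4 are the constants commonly
quoted for TM-5, not values stated in this document.

```python
def complexity(bits, avg_q):
    """Complexity index X = S * Q for one picture type (expressions (1)-(3))."""
    return bits * avg_q


def tm5_targets(R, X_i, X_p, X_b, N_p, N_b, K_p=1.0, K_b=1.4):
    """Return (T_i, T_p, T_b) in the form of expressions (4) to (6).

    R        : bits remaining for the rest of the GOP
    X_*      : complexity indexes of I, P and B pictures (must be > 0)
    N_p, N_b : P and B pictures still to be coded (assumed not both zero)
    K_p, K_b : assumed quantization ratio constants
    """
    T_i = R / (1.0 + N_p * X_p / (X_i * K_p) + N_b * X_b / (X_i * K_b))
    T_p = R / (N_p + N_b * K_p * X_b / (K_b * X_p))
    T_b = R / (N_b + N_p * K_b * X_p / (K_p * X_b))
    return T_i, T_p, T_b


# With equal complexities and K_p = K_b = 1, the remaining bits are simply
# shared equally among the remaining pictures of the GOP.
T_i, T_p, T_b = tm5_targets(1000, 10, 10, 10, N_p=2, N_b=2, K_p=1.0, K_b=1.0)
```

In practice only the target for the picture type about to be coded is used,
and R, N_p and N_b are updated after each picture.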

The above-mentioned rate control scheme does not involve
object shape information since it was developed for MPEG-2. For
MPEG-4, in which object-based encoding is performed, schemes have been
proposed in Japanese Patent Application Laid-Open
No. 2000-50254, and on pages 186 to 199 of the document "IEEE
Transactions on Circuits and Systems for Video Technology, Vol. CSVT-9,
No. 1, February 1999". These schemes employ a second-order rate-distortion
curve, which is described on pages 246 to 250 of "IEEE Transactions on
Circuits and Systems for Video Technology, Vol. CSVT-7, No. 1, February
1997". Namely, the following expressions (7), (8) and (9) replace
(1), (2) and (3) in modeling the number of bits.



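\[ S_i = \frac{X_i D}{Q_i} + \frac{Y_i D}{Q_i^2} \tag{7} \]

\[ S_p = \frac{X_p D}{Q_p} + \frac{Y_p D}{Q_p^2} \tag{8} \]

\[ S_b = \frac{X_b D}{Q_b} + \frac{Y_b D}{Q_b^2} \tag{9} \]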

In (7) to (9), D denotes the mean absolute difference (MAD) of a motion
compensation predictive difference signal. Besides, X_i, Y_i, X_p, Y_p, X_b,
Y_b and D are the model parameters of bit quantity. The bit number
model parameter calculating means 1006 calculates the values by a least
squares estimation based on the values of the quantization parameters
for past encoded frames, the MAD and the data on the number of
generated bits. The target calculation proceeds as follows: first,
substitute expressions (10) and (11) into (8) and (9) to eliminate Q_p and
Q_b, and then, assuming that S_i = T_i, S_p = T_p and S_b = T_b, substitute
(7), (8) and (9) into (12) to obtain the value of Q_i. Thus, target bit
numbers T_i, T_p and T_b for I, P and B frames are calculated.

\[ Q_p = K_p Q_i \tag{10} \]

\[ Q_b = K_b Q_i \tag{11} \]

\[ T_i + N_p T_p + N_b T_b = R \tag{12} \]
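
The substitution described above reduces to a quadratic equation in Q_i.
Writing a = D (X_i + N_p X_p / K_p + N_b X_b / K_b) and
b = D (Y_i + N_p Y_p / K_p^2 + N_b Y_b / K_b^2), expression (12) becomes
a / Q_i + b / Q_i^2 = R, that is, R Q_i^2 - a Q_i - b = 0. The following is
a minimal sketch of that calculation; the function names and the choice of
the larger root are assumptions made for illustration.

```python
import math


def solve_q_i(R, D, X, Y, N_p, N_b, K_p, K_b):
    """Solve R*Q^2 - a*Q - b = 0 for the I-picture quantization parameter.

    X and Y are dicts keyed by 'i', 'p', 'b' holding the first- and
    second-order model parameters; D is the MAD. The larger root is
    returned, which is the positive one whenever a and b are non-negative.
    """
    a = D * (X["i"] + N_p * X["p"] / K_p + N_b * X["b"] / K_b)
    b = D * (Y["i"] + N_p * Y["p"] / K_p ** 2 + N_b * Y["b"] / K_b ** 2)
    disc = a * a + 4.0 * R * b
    if R <= 0 or disc < 0:
        raise ValueError("no real solution for Q_i")
    return (a + math.sqrt(disc)) / (2.0 * R)


def targets_from_q(Q_i, D, X, Y, K_p, K_b):
    """Evaluate the second-order model at Q_i, K_p*Q_i and K_b*Q_i."""
    def bits(kind, q):
        return X[kind] * D / q + Y[kind] * D / (q * q)
    return bits("i", Q_i), bits("p", K_p * Q_i), bits("b", K_b * Q_i)


X = {"i": 2000.0, "p": 1000.0, "b": 500.0}     # illustrative values only
Y = {"i": 10000.0, "p": 5000.0, "b": 2000.0}   # illustrative values only
Q_i = solve_q_i(100000.0, 5.0, X, Y, N_p=3, N_b=8, K_p=1.0, K_b=1.4)
T_i, T_p, T_b = targets_from_q(Q_i, 5.0, X, Y, K_p=1.0, K_b=1.4)
```

The resulting targets satisfy T_i + N_p T_p + N_b T_b = R up to rounding
error, which is the constraint of expression (12).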

In Japanese Patent Application Laid-Open No. 2000-50254,
there is also disclosed a scheme in which the bits for a frame estimated
as above are allocated among a plurality of objects included in the frame.
According to the scheme, the target bit number for a VOP (Video Object
Plane) is determined by allocating bits to each object in proportion to a
weighting factor, which is the weighted average of the size, motion and
activity of the object.

First, bit allocation for each frame is decided by using
expression (13).



\[ T = \sum_{j=1}^{m} \left\{ \alpha S_j + (1 - \alpha) \max\left[ L, \frac{R}{mN} \right] \right\} \tag{13} \]

In (13), L indicates the number of bits required to assure a minimum
quality. In addition, m denotes the number of objects, N denotes the
number of frames which remain to be coded, S_j denotes the number of
bits generated for the j-th object of the previous frame, and α denotes a
weighting factor. Here, the value of the weighting factor α is 0.2. More
specifically, the remaining available bits R are allocated equally among
VOPs, and then the allocation of each VOP is adjusted according to the
number of bits generated in the previous coding. Thus, the total target
bit number for objects included in a frame is estimated.

Next, the total target bit number T obtained by (13) is refined 
by buffer processing. After that, the total target bits T are allocated for 
each object according to the weight given by expression (14). 

\[ w_s \, \mathrm{SIZE}_j + w_m \, \mathrm{MOT}_j + w_v \, \mathrm{MAD}_j^2 \tag{14} \]

In (14), SIZE_j, MOT_j and MAD_j indicate the size, motion vector, and MAD
of the j-th object, respectively. Besides, w_s, w_m and w_v are weighting
factors with values of w_s = 0.4, w_m = 0.6 and w_v = 0.0, or w_s = 0.25,
w_m = 0.25 and w_v = 0.5.
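
The weighting of expression (14) can be illustrated with the following
sketch. It assumes that the size, motion and squared-MAD terms are each
normalized by their sum over all objects so that the weights add up to one;
that normalization, like the function and argument names, is an assumption
made for illustration and is not stated in this document.

```python
def allocate_among_objects(T, sizes, motions, mads, w_s=0.4, w_m=0.6, w_v=0.0):
    """Split the frame target T among objects using weights as in (14).

    sizes, motions, mads: one non-negative value per object.
    Assumes w_s + w_m + w_v == 1 so that the shares add up to T.
    """
    def normalise(values):
        total = sum(values)
        if total <= 0:
            return [1.0 / len(values)] * len(values)
        return [v / total for v in values]

    size_n = normalise(sizes)
    mot_n = normalise(motions)
    var_n = normalise([m * m for m in mads])   # squared MAD term
    weights = [w_s * s + w_m * m + w_v * v
               for s, m, v in zip(size_n, mot_n, var_n)]
    return [T * w for w in weights]


# A large object with small MAD and a very small object with large MAD,
# using the second weight set (0.25, 0.25, 0.5).
shares = allocate_among_objects(1000.0, [9000, 100], [1, 1], [2, 10],
                                w_s=0.25, w_m=0.25, w_v=0.5)
```

With these illustrative inputs the very small object receives the larger
share, because the MAD term does not depend on the object size.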

In the conventional rate control techniques, the bit rate is
estimated without regard to variations in the number of bits due to
the changing size of each object. In TM-5, this is not a problem since the
size of the picture plane subject to encoding stays unchanged.
However, in object-based coding such as MPEG-4, when there is a
sudden increase in an object's size, bit allocation is not carried out
successfully toward the end of a GOV (Group Of VOPs) despite the
substantial increase in the amount of generated bits. Consequently,
picture quality deteriorates, and encoding cannot be performed within
the target bit number.

In the scheme disclosed in Japanese Patent Application
Laid-Open No. 2000-50254, the object size information is taken into
account by expression (14) when allocating the total target
bits for a frame among objects included in the frame. However, since
the information is not considered when determining the total
target by expression (13), the same problems occur.

Moreover, in expression (13) of the above scheme for
estimating the target bit number for each frame, it is assumed that the
VOP rate of the respective objects is uniform. Therefore, when the
frame rate differs among objects, the estimate of the target bit
number for the frame is not appropriate.

Furthermore, in the rate control for video objects according to
the above scheme, bits are allocated based on a weight, which is given as
the linear weighted sum of the size (area), motion and MAD of an object.
The MAD term, however, is independent of the size of the object,
and may give a large value even when the object is very small.
Consequently, a large number of bits are allocated to a small object
when the value of its MAD is large. As a result, the number of bits to
be allocated to other objects is reduced on the whole, which causes
deterioration in the picture quality of the other objects.

SUMMARY OF THE INVENTION

It is therefore an object of the present invention to provide a
moving picture encoding system capable of bit rate control by which
moving pictures are encoded while maintaining high quality even when
there are substantial changes in the size of objects and the
characteristics of texture.

It is another object of the present invention to provide
a moving picture encoding system capable of bit rate control by which bit
allocation is performed properly even when the frame rate varies with
each object.

In accordance with the first aspect of the present invention, there is 
provided a moving picture encoding system for encoding moving picture 
sequences with respect to each object, comprising: a coding means for 
encoding object picture data consisting of time series sequences of video 
object planes (VOPs), each of which is a picture image of at least one 
object at a point of time, and shape information data indicating the shape 
of the object in each VOP while conducting bit rate control so that the 
number of generated bits for each VOP meets a target bit number, and 
outputting coding information including a quantization parameter used 
in encoding and the generated bit number along with obtained bit 
streams; an area calculating means for calculating the area of the object 
in each VOP based on the shape information data, and outputting the 
result as area data; a predictive area calculating parameter extracting 
means for obtaining a function that indicates temporal variations in the 
area of the object based on the history of the area data, and outputting a 
parameter specifying the function or a predictive value of the area 
obtained by the function as a predictive area calculating parameter; a bit 
number model parameter calculating means for calculating a parameter 
used in modeling the generated bit number per unit area of the object 
based on the coding information, the generated bit number and the area 
data, and outputting the result as a bit number model parameter; a 
predictive bit number calculating parameter extracting means for 
obtaining a function that indicates temporal variations in the bit number 
model parameter based on the history of the bit number model 
parameter, and outputting a parameter specifying the function or a 
predictive value of the bit number model parameter obtained by the 
function as a predictive bit number calculating parameter; and a target
bit number calculating means which performs a series of processes:
calculating an uncoded VOP allocatable bit number that is the total
number of allocatable bits for uncoded VOPs in a certain period of time
based on allocatable bit number information indicating the total number
of allocatable bits for the VOPs in the certain period of time and the
number of generated bits for the encoded VOPs in the certain period of
time, estimating the number of generated bits for the uncoded VOPs
based on the predictive area calculating parameter and the predictive bit
number calculating parameter, allocating the uncoded VOP allocatable
bit number, calculating a target bit number for the next VOP to be
encoded, and outputting the target bit number; sequentially for each of
the VOPs in the certain period of time.

In accordance with the second aspect of the present invention, there
is provided a moving picture encoding system for encoding moving
picture sequences with respect to each object, comprising: a storing
means for temporarily storing object picture data consisting of time
series sequences of video object planes (VOPs), each of which is a picture
image of at least one object at a point of time, and shape information
data indicating the shape of the object in each VOP; a coding means
which reads the object picture data and shape information data out of the
storing means, encodes the data while conducting bit rate control so that
the number of generated bits for each VOP meets a target bit number,
and outputs coding information including a quantization parameter used
in encoding and the generated bit number along with obtained bit
streams; an area calculating means for calculating the area of the object
in each VOP based on the shape information data, and outputting the
result as area data; a bit number model parameter calculating means for
calculating a parameter used in modeling the generated bit number per
unit area of the object based on the coding information, the generated bit
number and the area data, and outputting the result as a bit number
model parameter; a predictive bit number calculating parameter 
extracting means for obtaining a function that indicates temporal 
variations in the bit number model parameter based on the history of the 
bit number model parameter, and outputting a parameter specifying the 
function or a predictive value of the bit number model parameter 
obtained by the function as a predictive bit number calculating 
parameter; and a target bit number calculating means which performs a 
series of processes: calculating an uncoded VOP allocatable bit number 
that is the total number of allocatable bits for uncoded VOPs in a certain 
period of time based on allocatable bit number information indicating the 
total number of allocatable bits for the VOPs in the certain period of time 
and the number of generated bits for the encoded VOPs in the certain 
period of time, estimating the number of generated bits for the uncoded 
VOPs based on the area data and the predictive bit number calculating 
parameter, allocating the uncoded VOP allocatable bit number, 
calculating a target bit number for the next VOP to be encoded, and 
outputting the target bit number; sequentially for each of VOPs in the 
certain period of time. 

In accordance with the third aspect of the present invention, there is 
provided a moving picture encoding system for encoding each frame of 
moving picture sequences while conducting bit rate control with respect 
to each section of the frame, comprising: a coding means which is 
supplied with picture data, section information data indicating the 
sections in each frame of the picture data and a target bit number for 
each section, encodes the data with respect to each section while 
conducting bit rate control so that the number of generated bits for each 
section meets the target bit number, and outputs coding information 
including a quantization parameter used in encoding and the generated 
bit number along with obtained bit streams; an area calculating means 
for calculating the area of the section in each frame based on the section
information data, and outputting the result as area data; a predictive 
area calculating parameter extracting means for obtaining a function 
that indicates temporal variations in the area of the section based on the 
history of the area data, and outputting a parameter specifying the 
function or a predictive value of the area obtained by the function as a 
predictive area calculating parameter; a bit number model parameter 
calculating means for calculating a parameter used in modeling the 
generated bit number per unit area of the section based on the coding 
information, the generated bit number and the area data, and outputting 
the result as a bit number model parameter; a predictive bit number 
calculating parameter extracting means for obtaining a function that 
indicates temporal variations in the bit number model parameter based 
on the history of the bit number model parameter, and outputting a 
parameter specifying the function or a predictive value of the bit number 
model parameter obtained by the function as a predictive bit number 
calculating parameter; and a target bit number calculating means which 
performs a series of processes: calculating an uncoded frame allocatable
bit number that is the total number of allocatable bits for uncoded 
frames in a certain period of time based on allocatable bit number 
information indicating the total number of allocatable bits for the frames 
in the certain period of time and the number of generated bits for the 
encoded frames in the certain period of time, estimating the number of 
generated bits for each section in the uncoded frames based on the 
predictive area calculating parameter and the predictive bit number 
calculating parameter, allocating the uncoded frame allocatable bit 
number, calculating a target bit number for each section in the next 
frame to be encoded, and outputting the target bit number; sequentially 
for each of frames in the certain period of time. 

In accordance with the fourth aspect of the present invention, there is 
provided a moving picture encoding system for encoding each frame of
moving picture sequences while conducting bit rate control with respect
to each section of the frame, comprising: a storing means for temporarily
storing picture data and section information data indicating the sections
in each frame of the picture data; a coding means which reads the picture
data and section information data out of the storing means, encodes the
data with respect to each section while conducting bit rate control so that
the number of generated bits for each section meets a target bit number
for the section, and outputs coding information including a quantization
parameter used in encoding and the generated bit number along with
obtained bit streams; an area calculating means for calculating the area
of the section in each frame based on the section information data, and
outputting the result as area data; a bit number model parameter
calculating means for calculating a parameter used in modeling the
generated bit number per unit area of the section based on the coding
information, the generated bit number and the area data, and outputting
the result as a bit number model parameter; a predictive bit number
calculating parameter extracting means for obtaining a function that
indicates temporal variations in the bit number model parameter based
on the history of the bit number model parameter, and outputting a
parameter specifying the function or a predictive value of the bit number
model parameter obtained by the function as a predictive bit number
calculating parameter; and a target bit number calculating means which
performs a series of processes: calculating an uncoded frame allocatable
bit number that is the total number of allocatable bits for uncoded
frames in a certain period of time based on allocatable bit number
information indicating the total number of allocatable bits for the frames
in the certain period of time and the number of generated bits for the
encoded frames in the certain period of time, estimating the number of
generated bits for each section in the uncoded frames based on the area
data and the predictive bit number calculating parameter, allocating the
uncoded frame allocatable bit number, calculating a target bit number 
for each section in the next frame to be encoded, and outputting the 
target bit number; sequentially for each of frames in the certain period of 
time. 

In accordance with the fifth aspect of the present invention, there is 
provided a moving picture encoding method for encoding moving picture 
sequences with respect to each object, comprising the steps of calculating 
an uncoded VOP allocatable bit number that is the total number of 
allocatable bits for uncoded VOPs in a certain period of time by 
subtracting the number of generated bits for the encoded VOPs in the 
certain period of time from the total number of allocatable bits for the 
VOPs in the certain period of time, estimating the number of generated 
bits for all the uncoded VOPs, calculating a target bit number for the 
next VOP to be encoded by allocating the uncoded VOP allocatable bit 
number, and encoding the VOP; sequentially for each of VOPs in the 
certain period of time. 

In accordance with the sixth aspect of the present invention, there is 
provided a moving picture encoding method for encoding each frame of 
moving picture sequences while conducting bit rate control with respect 
to each section of the frame, comprising the steps of calculating an 
uncoded frame allocatable bit number that is the total number of 
allocatable bits for uncoded frames in a certain period of time by 
subtracting the number of generated bits for the encoded frames in the 
certain period of time from the total number of allocatable bits for the 
frames in the certain period of time, estimating the number of generated 
bits for all the sections in the uncoded frames, calculating a target bit 
number for each section in the next frame to be encoded by allocating the 
uncoded frame allocatable bit number, and encoding the frame; 
sequentially for each of frames in the certain period of time. 

BRIEF DESCRIPTION OF THE DRAWINGS
The objects and features of the present invention will become 
more apparent from the consideration of the following detailed 
description taken in conjunction with the accompanying drawings in 
which:

Fig. 1 is a block diagram showing a conventional moving 
picture encoding system; 

Fig. 2 is a block diagram showing a moving picture encoding 
system according to an embodiment of the present invention; and 

Fig. 3 is a block diagram showing a moving picture encoding 
system according to another embodiment of the present invention. 

DESCRIPTION OF THE PREFERRED EMBODIMENT 

Referring now to the drawings, a description of preferred 
embodiments of the present invention will be given in detail. 

Fig. 2 is a block diagram showing a moving picture encoding 
system according to an embodiment of the present invention. 

The moving picture encoding system comprises a coding
means 101, a target bit number calculating means 102, a predictive area 
calculating parameter extracting means 103, a predictive bit number 
calculating parameter extracting means 104, an area calculating means 
105 and a bit number model parameter calculating means 106. 

The coding means 101 is supplied with the output of the target 
bit number calculating means 102 (target bit number), shape information 
data and object picture data, and generates bit streams as the first 
output. Its second output (generated bit number) is provided to the 
target bit number calculating means 102 and the bit number model 
parameter calculating means 106, and the third output (coding 
information) is provided to the bit number model parameter calculating 
means 106.

The area calculating means 105 is supplied with the shape
information data, and provides the predictive area calculating parameter
extracting means 103 and the bit number model parameter calculating
means 106 with its output (area data). The predictive area calculating
parameter extracting means 103, which is supplied with the output of
the area calculating means 105, provides the target bit number
calculating means 102 with its output (predictive area calculating
parameter).

The bit number model parameter calculating means 106,
to which the second and third outputs of the coding means 101 and the
output of the area calculating means 105 are inputted, provides the
predictive bit number calculating parameter extracting means 104 with
its output (bit number model parameter).

The predictive bit number calculating parameter extracting
means 104 is supplied with the output of the bit number model
parameter calculating means 106, and provides the target bit number
calculating means 102 with its output (predictive bit number calculating
parameter). The target bit number calculating means 102, to which the
allocatable bit number information, the second output of the coding
means 101, and the outputs of the predictive area calculating parameter
extracting means 103 and the predictive bit number calculating
parameter extracting means 104 are inputted, provides the coding means
101 with its output (target bit number).

The shape information data indicate the shape of objects. An
example of such data is a mask picture in which pixels inside the object
region have the value 255 and pixels outside the object have the value 0.
The object picture data indicate a sequence of picture images of
the object extracted from each frame.

The shape information data and the object picture data are
inputted to the coding means 101, and encoded. The encoding of
MPEG-4, for example, is applicable as a method of encoding. In this
case, bit rate control is conducted so that the number of bits for each
VOP (video object plane) meets a target bit number. After that, bit
streams obtained by the encoding are outputted together with the
generated bit number and the coding information.

The coding information is characteristic values found in 
encoding such as activity or motion compensation predictive difference 
power, the mean absolute difference (MAD) of a motion compensation 
predictive difference signal, the quantization parameter, and motion 
vector information. 

The shape information is also inputted to the area calculating 
means 105. The area calculating means 105 calculates the area of an 
object region by counting pixels included in the object region according to 
the shape information. When the shape information is not binary but 
multivalued, the shape information is binarized before calculating the 
area. Alternatively, the area may be indicated by the number of blocks
(macroblocks) including the object region instead of the number of pixels.
The calculated area data are inputted to the predictive area calculating 
parameter extracting means 103 and the bit number model parameter 
calculating means 106. 
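
The two area measures described above can be sketched as follows, purely
for illustration. The mask is assumed to be a list of rows of pixel values;
the binarization threshold of 128 and the 16 x 16 block size are
assumptions, as are the function names.

```python
def object_area_pixels(mask, threshold=128):
    """Count the pixels belonging to the object.

    A multivalued mask is binarized by comparing each value with an
    assumed threshold before counting.
    """
    return sum(1 for row in mask for value in row if value >= threshold)


def object_area_macroblocks(mask, threshold=128, block=16):
    """Count the blocks that contain at least one object pixel."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    count = 0
    for by in range(0, height, block):
        for bx in range(0, width, block):
            if any(mask[y][x] >= threshold
                   for y in range(by, min(by + block, height))
                   for x in range(bx, min(bx + block, width))):
                count += 1
    return count


# 32 x 32 mask with a 4 x 4 object straddling the four 16 x 16 blocks.
mask = [[255 if 14 <= y <= 17 and 14 <= x <= 17 else 0 for x in range(32)]
        for y in range(32)]
area_px = object_area_pixels(mask)        # 16 pixels
area_mb = object_area_macroblocks(mask)   # 4 blocks
```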

The predictive area calculating parameter extracting means 
103 finds a function approximate to the temporal variations of the object 
area based on the history of the area data supplied from the area 
calculating means 105. The predictive area calculating parameter 
extracting means 103 then outputs a parameter required to describe the 
function as a predictive area calculating parameter. 

For example, the object area A(t) at the time t (or VOP 
number) in a VOP is approximated by n-th order expression (15). 



\[ A(t) = \sum_{j=0}^{n} p_j t^j \tag{15} \]



In (15), parameter p_j (j = 0, ..., n) is the predictive area calculating
parameter. These values are calculated by the least squares method or the
like from the values of the object area obtained in previous VOPs.
Alternatively, the predictive area calculating parameter may be the actual
predictive value of the object area in each VOP to be coded, obtained by
expression (15). In addition, it is also possible to designate several
types of functions in advance, and to output an index that specifies a
function together with a parameter as the predictive area
calculating parameter.

In this case, a function, which most approximates variations of 
the area, is found from among the designated types of functions, and a 
parameter thereof is obtained. Then, an index that indicates the type of 
the function and the parameter are outputted to the target bit number 
calculating means 102 as the predictive area calculating parameter. 
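
The selection among designated function types can be sketched as follows.
The two candidate types (a constant and a straight line), the use of the
sum of squared residuals as the closeness measure, and all names are
assumptions made for illustration.

```python
def fit_constant(ts, ys):
    """Least-squares constant: the mean of the samples."""
    return [sum(ys) / len(ys)]


def fit_linear(ts, ys):
    """Least-squares straight line y = p0 + p1 * t (closed form)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx if sxx else 0.0
    return [mean_y - slope * mean_t, slope]


# Designated function types; the list position serves as the index.
CANDIDATES = [
    (fit_constant, lambda p, t: p[0]),
    (fit_linear, lambda p, t: p[0] + p[1] * t),
]


def select_function(ts, areas):
    """Return (index, parameters) of the candidate with the least error.

    On a tie the earlier (simpler) candidate is kept.
    """
    best = None
    for index, (fit, evaluate) in enumerate(CANDIDATES):
        params = fit(ts, areas)
        err = sum((evaluate(params, t) - a) ** 2 for t, a in zip(ts, areas))
        if best is None or err < best[0]:
            best = (err, index, params)
    return best[1], best[2]


index, params = select_function([0, 1, 2, 3], [10, 20, 30, 40])
```

A comparison based only on the residual tends to favor the more flexible
function, so an actual implementation might add a penalty for the number
of parameters; that choice is outside what is described here.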

On the other hand, the generated bit number and the coding 
information outputted from the coding means 101 are inputted to the bit 
number model parameter calculating means 106. The operation of the 
bit number model parameter calculating means 106 is basically the same 
as that of the bit number model parameter calculating means 1006, in 
which the relationship between the generated bit number and the coding 
information is expressed in a mathematical model, and a parameter to 
describe the model is calculated as a bit number model parameter. 
However, in the calculation of the bit number model parameter, the bit 
number model parameter calculating means 1006 models the number of 
bits for a whole picture plane, while the bit number model parameter 
calculating means 106 models the number of bits per unit area. 

Consequently, the area data calculated by the area calculating 
means 105 are also inputted to the bit number model parameter 
calculating means 106 to be used in the calculation of the bit number
model parameter. Incidentally, the model employed in the calculation
is not restricted to the model given by (1) to (3), or (7) to (9), and
other models may be used.
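
As one possible instance of modeling per unit area, the first-order model
of expressions (1) to (3) can be divided by the object area, giving a
complexity per unit area X = S Q / A. This particular form and the names
below are assumptions for illustration; the description above only
requires that the number of bits be modeled per unit area.

```python
def complexity_per_unit_area(bits, avg_q, area):
    """Per-unit-area complexity X = S * Q / A for one coded VOP."""
    if area <= 0:
        raise ValueError("area must be positive")
    return bits * avg_q / area


def estimate_bits(x_unit, area, q):
    """Invert the model: expected bits for a VOP of the given area and Q."""
    return x_unit * area / q


x_unit = complexity_per_unit_area(bits=12000, avg_q=10, area=4000)
# If the object doubles in area at the same quantization parameter, the
# estimated number of bits doubles as well.
doubled = estimate_bits(x_unit, area=8000, q=10)
```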

Subsequently, the obtained model parameter is outputted to 
the predictive bit number calculating parameter extracting means 104. 

The predictive bit number calculating parameter extracting 
means 104 finds a function approximate to the temporal variations of the 
model parameter based on the history of the bit number model 
parameter supplied from the bit number model parameter calculating 
means 106. The predictive bit number calculating parameter extracting 
means 104 then outputs a parameter required to describe the function as 
a predictive bit number calculating parameter. 

For example, the bit number model parameter X(t) at the time 
t (or VOP number) in a VOP is approximated by n-th order expression 
(16). 

\[ X(t) = \sum_{j=0}^{n} q_j t^j \tag{16} \]

In (16), parameter qj (j = 0, n) is the predictive bit number calculating 
parameter. These values are calculated by the least squares method etc. 
according to the values of the bit number model parameter obtained in 
previous VOPs. Besides, the predictive bit number calculating 
parameter may be the actual predictive value of the bit number model 
parameter in each VOP to be coded obtained by (16). In addition, it is 
also possible that the several types of functions are previously designated, 
and an index that specifies a function is outputted together with a 
parameter as the predictive bit number calculating parameter. 
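The least squares fit of expression (16) can be sketched as follows. This is a minimal pure-Python illustration that solves the normal equations directly; the function names fit_polynomial and predict are hypothetical, and a practical encoder might well use a different solver or a fixed low order.

```python
def fit_polynomial(times, values, order):
    """Least-squares fit of values ~ sum_j q[j] * t**j; returns [q_0, ..., q_n]."""
    n = order + 1
    # Normal equations (V^T V) q = V^T x, with Vandermonde rows [1, t, t^2, ...].
    ata = [[sum(t ** (r + c) for t in times) for c in range(n)] for r in range(n)]
    atx = [sum((t ** r) * x for t, x in zip(times, values)) for r in range(n)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [row + [rhs] for row, rhs in zip(ata, atx)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("not enough distinct samples for this order")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(q, t):
    """Evaluate expression (16): X(t) = sum_j q[j] * t**j."""
    return sum(qj * t ** j for j, qj in enumerate(q))
```

Here times and values stand for the VOP numbers and the bit number model parameters observed in previously coded VOPs; predict then extrapolates the parameter to a VOP that has not yet been coded.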

The predictive area calculating parameter outputted from the 
predictive area calculating parameter extracting means 103 and the 
predictive bit number calculating parameter outputted from the 
predictive bit number calculating parameter extracting means 104 are 
inputted to the target bit number calculating means 102. The target bit 
number calculating means 102 is also supplied with the allocatable bit 
number information and the generated bit number outputted from the 
coding means 101. 

The target bit number calculating means 102 distributes the 
bits allocatable within a certain period of time among the VOPs in that 
period by using the predictive area calculating parameter and the 
predictive bit number calculating parameter, and thereby obtains the 
target bit number for each VOP. The number of bits allocated to each 
VOP is outputted as a target bit number from the target bit number 
calculating means 102. 

The certain period of time corresponds to the time interval taken 
for a GOV. When the bit rate control is executed with respect to each 
GOV, first, the number of available bits for a GOV is calculated based on 
the allocatable bit number information. A coding bit rate specified by a 
user, for example, is used as the allocatable bit number information. In 
this case, the number of available bits for a GOV is calculated by dividing 
the number of allocatable bits given by the information by the VOP rate, 
and then multiplying the quotient by the number of VOPs in the GOV. 

Next, when there are VOPs, which have already been encoded, 
in the GOV, the number of bits used for encoding them is subtracted 
from the number of available bits for the GOV to find remaining 
available bits. Subsequently, the target bit number for the next VOP to 
be coded is calculated in consideration of the allocation of the available 
bits for remaining VOPs in the GOV. 
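The bookkeeping for a GOV described above can be sketched as follows. The function names and the example figures are illustrative assumptions; the specification gives only the arithmetic.

```python
def gov_bit_budget(bit_rate: float, vop_rate: float, vops_in_gov: int) -> float:
    """Bits available for one GOV: (bits per second / VOPs per second) * VOP count."""
    return bit_rate / vop_rate * vops_in_gov


def remaining_bits(gov_budget: float, bits_already_used) -> float:
    """Remaining available bits R after the VOPs already encoded in the GOV."""
    return gov_budget - sum(bits_already_used)


# Example: 384 kbit/s, 30 VOPs per second, 15 VOPs per GOV.
budget = gov_bit_budget(384000, 30, 15)
r = remaining_bits(budget, [40000, 9000])
```

The value r plays the role of the remaining available bit number that is shared among the VOPs still to be coded in the GOV.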

In the following, the process of calculating target bit numbers 
T_I, T_P and T_B for I, P and B VOPs will be described using the bit number 
models obtained by (1) to (3). Incidentally, while (1) to (3) are applied to 
modeling the bit number for a whole picture plane in the 
above-mentioned conventional scheme, in this embodiment, (1) to (3) are 
applied to modeling the bit number per unit area. 

Assuming that the target bit number per unit area of I VOP is 
T, the time for I VOP is t_I, the area at time t is A(t), and the complexity 
indexes for I, P and B VOPs are X_I(t), X_P(t) and X_B(t), respectively, the target 
bit numbers T_I, T_P and T_B of the I, P and B VOPs at time t are expressed 
as follows: 

\[ T_I(t_I) = T A(t_I) \tag{17} \]

\[ T_P(t) = \frac{X_P(t)}{K_P X_I(t_I)} T A(t) \tag{18} \]

\[ T_B(t) = \frac{X_B(t)}{K_B X_I(t_I)} T A(t) \tag{19} \]



In (17), (18) and (19), K_P and K_B denote parameters to control the 
roughness of quantization for P and B pictures similarly to the system of 
TM-5. Assuming that the number of remaining available bits is R, 
expression (20), which is a generalized expression of (12), can be formed. 

\[ \sum_{t \in \tau_I} T_I(t) + \sum_{t \in \tau_P} T_P(t) + \sum_{t \in \tau_B} T_B(t) = R \tag{20} \]

In (20), τ_I, τ_P and τ_B denote the sets of times at which the type of VOP is I, P 
and B, respectively, among the remaining VOPs included in a GOV. 
Incidentally, when there is no VOP of the corresponding type among the 
remaining VOPs, the corresponding set τ_I, τ_P or τ_B is a null set. 

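Thereby, the target bit number for I VOP at time t_I can be 
expressed by expression (21). Similarly, the target bit numbers for P 
VOP at time t_P and B VOP at time t_B are calculated by the following 
expressions (22) and (23). 

\[ T_I(t_I) = \frac{X_I(t_I) A(t_I) R}{X_I(t_I) A(t_I) + \frac{1}{K_P} \sum_{t \in \tau_P} X_P(t) A(t) + \frac{1}{K_B} \sum_{t \in \tau_B} X_B(t) A(t)} \tag{21} \]

\[ T_P(t_P) = \frac{X_P(t_P) A(t_P) R}{\sum_{t \in \tau_P} X_P(t) A(t) + \frac{K_P}{K_B} \sum_{t \in \tau_B} X_B(t) A(t)} \tag{22} \]

\[ T_B(t_B) = \frac{X_B(t_B) A(t_B) R}{\sum_{t \in \tau_B} X_B(t) A(t) + \frac{K_B}{K_P} \sum_{t \in \tau_P} X_P(t) A(t)} \tag{23} \]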
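The step from (20) to (21) is not spelled out. One way to fill it in is given below, reading (17) to (19) as T_I(t_I) = T A(t_I), T_P(t) = X_P(t) T A(t) / (K_P X_I(t_I)) and T_B(t) likewise with K_B, and assuming that the remaining VOPs contain a single I VOP at time t_I. The auxiliary symbol c is introduced here only for the derivation.

```latex
\begin{align*}
c &:= \frac{T}{X_I(t_I)}, \\
T_I(t_I) &= c\,X_I(t_I)A(t_I), \quad
T_P(t) = \frac{c}{K_P}X_P(t)A(t), \quad
T_B(t) = \frac{c}{K_B}X_B(t)A(t), \\
R &= c\left[ X_I(t_I)A(t_I)
   + \frac{1}{K_P}\sum_{t\in\tau_P} X_P(t)A(t)
   + \frac{1}{K_B}\sum_{t\in\tau_B} X_B(t)A(t) \right], \\
T_I(t_I) &= \frac{X_I(t_I)A(t_I)\,R}{X_I(t_I)A(t_I)
   + \frac{1}{K_P}\sum_{t\in\tau_P} X_P(t)A(t)
   + \frac{1}{K_B}\sum_{t\in\tau_B} X_B(t)A(t)}.
\end{align*}
```

When τ_I is a null set, the first term in the bracket vanishes; multiplying the numerator and denominator of T_P(t_P) = (c / K_P) X_P(t_P) A(t_P) by K_P then gives the form of (22), and (23) follows in the same way with K_B.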



In the target bit number calculating means 102, the values of 
the area and the complexity index concerning the remaining VOPs 
included in the GOV are calculated first using the predictive area 
calculating parameter and the predictive bit number calculating 
parameter. Then, the target bit number is calculated by an expression 
selected from (21) to (23) according to the type (I, P or B) of the next VOP 
to be coded using the obtained values of the area and the complexity 
index. 
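The selection among (21) to (23) can be sketched with a single weighted formula, under the assumption that the three expressions share the form "own weighted complexity times area, divided by the weighted sum over all remaining VOPs", with weights 1, 1/K_P and 1/K_B for I, P and B VOPs. The function name, the tuple layout and the default values K_P = 1.0 and K_B = 1.4 (values commonly used with TM-5) are assumptions made for illustration.

```python
def target_bit_number(index, vops, r_bits, k_p=1.0, k_b=1.4):
    """Target bits for vops[index], the next VOP to be coded.

    vops   : remaining VOPs in the GOV, each a tuple
             (vop_type, predicted complexity per unit area, predicted area)
    r_bits : remaining available bits R
    """
    weight = {"I": 1.0, "P": 1.0 / k_p, "B": 1.0 / k_b}
    # Each VOP's share is proportional to weight * X(t) * A(t).
    shares = [weight[t] * x * a for t, x, a in vops]
    return r_bits * shares[index] / sum(shares)


remaining = [("I", 4.0, 100.0), ("P", 2.0, 100.0), ("B", 4.0, 100.0)]
targets = [target_bit_number(i, remaining, 800.0, k_p=1.0, k_b=2.0)
           for i in range(len(remaining))]
```

Multiplying numerator and denominator by K_P or K_B recovers the ratios K_P/K_B and K_B/K_P that appear in (22) and (23), and the targets of all remaining VOPs add up to R as required by (20).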

Besides, the quadratic bit number models shown by (7) to (9) 
may also be employed. In this case, functions that indicate the 
temporal variations of bit number model parameters X_I, Y_I, X_P, Y_P, X_B, Y_B 
and D are found, and parameters to describe the functions are the 
predictive bit number calculating parameters. Incidentally, while (7) to 
(9) are applied to modeling the bit number for a whole picture plane in 
the above-mentioned conventional scheme, in this embodiment, (7) to (9) 
are applied to modeling the bit number per unit area. The calculation 
process is the same as in the above case. 

The calculated target bit number is inputted to the coding 
means 101, and used for encoding the next VOP. 

In the above description, the target bit number is determined 
based only on the bit counts for texture; however, it may also be determined 
in consideration of the bits for motion and shape. 

In this case, the coding means 101 counts the number of 
generated bits for the motion, shape and texture individually, and 
outputs the results as the generated bit number. 

The bit number model parameter calculating means 106 
calculates parameters for modeling the bit numbers for the motion and 
shape per unit area in addition to the bit number model parameter 
concerning the texture. The bit number model parameter concerning 
the motion or shape is calculated simply by dividing the bit number for the 
motion or shape by the area given by the area data supplied from the 
area calculating means 105, and is then outputted. 

The predictive bit number calculating parameter extracting 
means 104 finds functions approximate to the temporal variations of the 
model parameters based on the history of the bit number model 
parameters concerning the motion and shape. The predictive bit 
number calculating parameter extracting means 104 then outputs 
parameters required to describe the functions as predictive bit number 
calculating parameters along with the bit number model parameter 
concerning the texture. 

The target bit number calculating means 102 calculates the 
predictive amounts of the motion and shape bits per unit area for 
remaining VOPs by using the predictive bit number calculating 
parameters concerning the motion and shape. Subsequently, each 
result is multiplied by the predictive value of the area estimated 
according to the predictive area calculating parameter outputted from 
the predictive area calculating parameter extracting means 103, and 
thus the predictive numbers of the motion and shape bits are calculated. 
Then, the sum of the predictive values for the remaining VOPs is 
subtracted from the remaining allocatable bit number R, and the result 
is set anew as R in (21) to (23) to calculate the target bit number for the 
texture. Thus the bit rate control, in which the numbers of bits for the 
motion and shape are also taken into account, can be performed. 
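The adjustment of R for motion and shape bits can be sketched as follows. The function name and the tuple layout are hypothetical; the predicted per-unit-area values are assumed to come from the fitted functions described above.

```python
def texture_budget(r_bits, remaining_vops):
    """Bits left for texture after reserving predicted motion and shape bits.

    remaining_vops: tuples of (predicted area,
                               predicted motion bits per unit area,
                               predicted shape bits per unit area)
    """
    overhead = sum(area * (motion + shape)
                   for area, motion, shape in remaining_vops)
    # The result replaces R in (21) to (23). It is not clamped here, since
    # the specification does not say how a negative budget is handled.
    return r_bits - overhead
```

For instance, with R = 10000 and two remaining VOPs of areas 100 and 200 whose predicted motion and shape bits per unit area sum to 3.0 and 1.5, 600 bits are reserved and 9400 remain for texture.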

While in the above description, each object is encoded under 
the bit rate control individually, it is possible that plural objects are 
processed at the same time, and then the target bit rates for VOPs of 
each object are decided. 

Assuming that the number of objects is J, the shape 
information data and object picture data supplied to the coding means 
shown in Fig. 2 are the data of J objects. Likewise there are the area 
data, the predictive area calculating parameter, the coding information, 
the generated bit number, the bit number model parameter, the 
predictive bit number calculating parameter, the target bit number, and 
the bit streams for each of the J objects. The coding means 101, the 
area calculating means 105, the predictive area calculating parameter 
extracting means 103, the bit number model parameter calculating 
means 106 and the predictive bit number calculating parameter 
extracting means 104 execute the above-mentioned processes for each of 
the J objects. 

On the other hand, the target bit number calculating means 
102 calculates the target bit number for each of the J objects based on 
the allocatable bit number, and on the predictive area calculating parameter, 
the predictive bit number calculating parameter and the generated bit 
number of each of the J objects. In the following, the description 
will be given of the calculation procedure of the target bit number for 
each object in this case. 

The quantization parameter for I VOP of the j-th object can be 
expressed as follows: 

\[ Q_{I,j} = M_j Q \tag{24} \]

In (24), Q_{I,j} denotes the quantization parameter for I VOP of the j-th 
object, and Q denotes the standard quantization parameter. The value 
of Q varies depending on the allocatable bit number. Besides, M_j 
denotes the constant for adjusting the roughness of quantization. When 
M_j is set to 1 for all objects, all the objects are quantized in the same 
roughness level. In other words, it is possible to make variations in the 
roughness of quantization among objects by the value of M_j. For example, 
quantization can be controlled so that visually important objects are 
quantized finely and the others are quantized roughly. In the following, 
the process of calculating the target bit number for each object based on 
Q set to accord with the allocatable bit number will be described. 

The quantization parameters for P and B VOPs of the j-th 
object, Q_{P,j} and Q_{B,j}, are obtained by expressions (25) and (26) by using 
(24). 

\[ Q_{P,j} = K_{P,j} Q_{I,j} = K_{P,j} M_j Q \tag{25} \]

\[ Q_{B,j} = K_{B,j} Q_{I,j} = K_{B,j} M_j Q \tag{26} \]

In (25) and (26), K_{P,j} and K_{B,j} denote the ratios of the quantization 
parameters for P and B VOPs to the quantization parameter for I VOPs. 
Thereby the target bit numbers for I, P and B VOPs of the j-th object are 
defined by the following expressions (27), (28) and (29), respectively. 

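\[ T_{I,j}(t_I) = \frac{X_{I,j}(t_I)}{Q_{I,j}} A_j(t_I) = \frac{X_{I,j}(t_I)}{M_j Q} A_j(t_I) \tag{27} \]

\[ T_{P,j}(t_P) = \frac{X_{P,j}(t_P)}{Q_{P,j}} A_j(t_P) = \frac{X_{P,j}(t_P)}{K_{P,j} M_j Q} A_j(t_P) \tag{28} \]

\[ T_{B,j}(t_B) = \frac{X_{B,j}(t_B)}{Q_{B,j}} A_j(t_B) = \frac{X_{B,j}(t_B)}{K_{B,j} M_j Q} A_j(t_B) \tag{29} \]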



Assuming that τ_{I,j}, τ_{P,j} and τ_{B,j} denote the sets of times at which 
a VOP is of I, P and B type, respectively, among the remaining VOPs included 
in the GOV for the j-th object, expression (30) can be formed. 

\[ \sum_{j} \left[ \sum_{t \in \tau_{I,j}} T_{I,j}(t) + \sum_{t \in \tau_{P,j}} T_{P,j}(t) + \sum_{t \in \tau_{B,j}} T_{B,j}(t) \right] = R \tag{30} \]

Incidentally, when there is no VOP of the above types among the remaining 
VOPs, the corresponding set τ_{I,j}, τ_{P,j} or τ_{B,j} is a null set. By substituting 
(27) to (29) into (30) and solving for Q, expression (31) is obtained. 

\[ Q = \frac{1}{R} \sum_{j} \frac{1}{M_j} \left[ \sum_{t \in \tau_{I,j}} X_{I,j}(t) A_j(t) + \frac{1}{K_{P,j}} \sum_{t \in \tau_{P,j}} X_{P,j}(t) A_j(t) + \frac{1}{K_{B,j}} \sum_{t \in \tau_{B,j}} X_{B,j}(t) A_j(t) \right] \tag{31} \]



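Further, by substituting (31) into (27) to (29), the expressions 
(32), (33) and (34) are obtained for the k-th object: 

\[ T_{I,k}(t_I) = \frac{X_{I,k}(t_I) A_k(t_I) R}{M_k \sum_{j} \frac{1}{M_j} \left[ \sum_{t \in \tau_{I,j}} X_{I,j}(t) A_j(t) + \frac{1}{K_{P,j}} \sum_{t \in \tau_{P,j}} X_{P,j}(t) A_j(t) + \frac{1}{K_{B,j}} \sum_{t \in \tau_{B,j}} X_{B,j}(t) A_j(t) \right]} \tag{32} \]

\[ T_{P,k}(t_P) = \frac{X_{P,k}(t_P) A_k(t_P) R}{K_{P,k} M_k \sum_{j} \frac{1}{M_j} \left[ \sum_{t \in \tau_{I,j}} X_{I,j}(t) A_j(t) + \frac{1}{K_{P,j}} \sum_{t \in \tau_{P,j}} X_{P,j}(t) A_j(t) + \frac{1}{K_{B,j}} \sum_{t \in \tau_{B,j}} X_{B,j}(t) A_j(t) \right]} \tag{33} \]

\[ T_{B,k}(t_B) = \frac{X_{B,k}(t_B) A_k(t_B) R}{K_{B,k} M_k \sum_{j} \frac{1}{M_j} \left[ \sum_{t \in \tau_{I,j}} X_{I,j}(t) A_j(t) + \frac{1}{K_{P,j}} \sum_{t \in \tau_{P,j}} X_{P,j}(t) A_j(t) + \frac{1}{K_{B,j}} \sum_{t \in \tau_{B,j}} X_{B,j}(t) A_j(t) \right]} \tag{34} \]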



Thus the target bit number for each object is calculated and outputted to 
the coding means. 
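The multi-object allocation can be sketched as follows, assuming the per-object targets take the form "complexity times area divided by the quantization parameter" with Q_{I,j} = M_j Q, Q_{P,j} = K_{P,j} M_j Q and Q_{B,j} = K_{B,j} M_j Q as in (24) to (26). The function name, the dictionary layout and the example values are hypothetical.

```python
def multi_object_targets(objects, r_bits):
    """Return (Q, targets) for all remaining VOPs of all objects.

    objects: name -> {"m": M_j, "k_p": K_Pj, "k_b": K_Bj,
                      "vops": [(vop_type, complexity, area), ...]}
    """
    def divisor(obj, vop_type):
        # Quantization parameter relative to Q: M_j, K_Pj*M_j or K_Bj*M_j.
        k = {"I": 1.0, "P": obj["k_p"], "B": obj["k_b"]}[vop_type]
        return obj["m"] * k

    # Standard quantization parameter Q that makes all targets sum to R.
    total = sum(x * a / divisor(obj, t)
                for obj in objects.values() for t, x, a in obj["vops"])
    q = total / r_bits
    # Target for each VOP: X * A / (K * M * Q).
    targets = {name: [x * a / (divisor(obj, t) * q) for t, x, a in obj["vops"]]
               for name, obj in objects.items()}
    return q, targets


scene = {
    "person": {"m": 1.0, "k_p": 1.0, "k_b": 2.0,
               "vops": [("I", 4.0, 100.0), ("P", 2.0, 100.0)]},
    "background": {"m": 2.0, "k_p": 1.0, "k_b": 2.0,
                   "vops": [("I", 4.0, 100.0), ("B", 4.0, 100.0)]},
}
q, targets = multi_object_targets(scene, 900.0)
```

In this example the background is quantized twice as roughly as the person (M = 2 against M = 1), so its I VOP receives half the bits of the person's I VOP although both have the same complexity and area.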

As is described above, even when there are plural objects in a 
frame, bit allocation can be performed properly. Besides, the target bit 
numbers obtained by (32) to (34) are applicable to the case where the 
frame rate varies with each object. 

Next, another embodiment of the present invention will be 
described referring to Fig. 3. This embodiment can be applied to the 
case where delay is admissible in encoding. Basically, the moving 
picture encoding system of this embodiment has the same configuration 
as that illustrated in Fig. 2 except that it does not include the predictive 
area calculating parameter extracting means 103 and is provided with a 
buffer 200. 

In addition, a target bit number calculating means 202 and a 
bit number model parameter calculating means 206 replace the target bit 
number calculating means 102 and the bit number model parameter 
calculating means 106, respectively. 

The buffer 200 is supplied with shape information data and 
object picture data, and its first output (shape information data) and the 
second output (object picture data) are sent to the coding means 101. 

The area calculating means 105 is supplied with the shape 
information data before being inputted to the buffer 200, and its output 
(area data) is sent to the target bit number calculating means 202 and 
the bit number model parameter calculating means 206. 

The target bit number calculating means 202 is provided with 
outputs of the predictive bit number calculating parameter extracting 
means 104 and the area calculating means 105, allocatable bit number 
information, and the second output of the coding means 101 (generated 
bit number), and its output is sent to the coding means 101. In other 
respects, the moving picture encoding system of this embodiment is the 
same as the system shown in Fig. 2. 

The buffer 200 stores the inputted shape information data and 
the object picture data for one unit period of the bit rate 
control (for example, the span of a GOV). The coding means 101 of this 
embodiment operates in the same manner as that in Fig. 2. 

The operation of the bit number model parameter calculating 
means 206 is basically the same as that of the bit number model 
parameter calculating means 106 in Fig. 2, except that the bit number 
model parameter calculating means 206 stores the area data outputted 
from the area calculating means 105 for a certain period of time, and 
calculates the bit number model parameter by using the area data of a 
VOP corresponding to the generated bit number outputted from the 
coding means 101. 

The target bit number calculating means 202 calculates the 
target bit number in the same manner as the target bit number 
calculating means 102 in Fig. 2. However, the target bit number 
calculating means 202 uses actual area data calculated prior to 
encoding by the area calculating means 105 for the calculation of the 
target bit number, while the target bit number calculating means 102 
uses a predictive value of the area A(t) for the calculation. 

The operation of the other parts is the same as that of the moving 
picture encoding system shown in Fig. 2. 

In the moving picture encoding system illustrated in Fig. 3, the 
target bit number calculating means 202 calculates the target bit 
number not by a predictive value of the area but by actual area data. 
Consequently, bit allocation can be performed with a higher degree of 
accuracy compared to the moving picture encoding system illustrated in 
Fig. 2. 

While a description has been given of a system in which objects 
are encoded individually, the moving picture encoding system of the 
present invention is applicable to the bit allocation for encoding a frame 
that is a combination of objects without making a separation into each 
object. In this case, a frame is divided into sections for each object, and 
the bit rate control is performed with respect to each section. That is, 
coding model parameters are calculated for each section corresponding to 
an object, and thereby the target bit number for each section is 
calculated in the same manner as the above-mentioned calculation of the 
target bit number for each object. Thus, each section is encoded under 
the individual bit rate control according to the target bit number. On 
this occasion, it is not necessary to encode shape information. The 
shape information is used for the purpose of the bit rate control only. 

Consequently, even when the area of a visually important 
object undergoes a lot of changes in a sequence, available bits can be 
allocated in a balanced manner. Besides, the moving picture encoding 
system of the present invention is also applicable to the case where a 
picture plane is divided into sections and the bit rate control is performed 
for each section to encode a sequence of frames not including plural 
separated objects. To take a screen of a picture phone as an example, 
the region carrying a picture of a person and the other regions are 
divided, and the target bit number is calculated likewise with respect to 
each region. Thus, it is made possible to perform the bit rate control 
according to the importance of each region. 

Additionally, the moving picture encoding system according to 
the present invention may be implemented by a computer that reads a 
program for executing its operation from a storage medium such as a CD-ROM, a 
floppy disk or a nonvolatile memory card. 

While the preferred embodiments of the present invention 
have been described using specific terms, such description is for 
illustrative purposes only, and it is to be understood that changes and 
variations may be made without departing from the spirit or the scope of 
the following claims. 



